Simplifying Model Trees with Regression and Splitting Nodes
Abstract
Model trees are tree-based regression models that associate leaves with linear regression models. A new method for the stepwise induction of model trees (SMOTI) has been developed. Its main characteristic is the construction of trees with two types of nodes: regression nodes, which perform only straight-line regression, and splitting nodes, which partition the feature space. In this way, internal regression nodes contribute to the definition of multiple linear models and have a “global” effect, while straight-line regressions at leaves have only “local” effects. This paper addresses the problem of simplifying model trees that contain both regression and splitting nodes. In particular, two methods are proposed: Reduced Error Pruning (REP) and Reduced Error Grafting (REG). Both rely on an independent pruning set. The effect of simplification on model trees induced with SMOTI is investigated empirically, and in most cases the results favour the simplified trees.
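The two node types and the role of an independent pruning set can be illustrated with a small sketch. The abstract does not give the details of SMOTI, REP, or REG, so everything below is an assumption made for illustration: the class names `Leaf`, `Regression`, and `Split`, the additive way a regression node's straight-line term is combined with its subtree, and the hypothetical `fallback` leaf stored at each splitting node. The pruning routine is a generic reduced-error pruning pass, not necessarily the authors' REP, and grafting is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

Row = Dict[str, float]
Data = List[Tuple[Row, float]]  # (features, target) pairs


@dataclass
class Leaf:
    """Straight-line regression with a purely local effect."""
    feature: str
    intercept: float
    slope: float


@dataclass
class Regression:
    """Internal regression node: its term applies to every leaf below it."""
    feature: str
    intercept: float
    slope: float
    child: "Node"


@dataclass
class Split:
    """Splitting node: routes an example left or right on a threshold."""
    feature: str
    threshold: float
    left: "Node"
    right: "Node"
    fallback: Leaf  # hypothetical: leaf used if this subtree is pruned


Node = Union[Leaf, Regression, Split]


def line(node: Union[Leaf, Regression], row: Row) -> float:
    return node.intercept + node.slope * row[node.feature]


def predict(node: Node, row: Row) -> float:
    if isinstance(node, Leaf):
        return line(node, row)
    if isinstance(node, Regression):
        # Assumed simplification: add this node's term to the subtree's output.
        return line(node, row) + predict(node.child, row)
    branch = node.left if row[node.feature] <= node.threshold else node.right
    return predict(branch, row)


def sse(node: Node, data: Data) -> float:
    """Sum of squared errors of a (sub)tree on the given examples."""
    return sum((y - predict(node, row)) ** 2 for row, y in data)


def prune(node: Node, data: Data) -> Node:
    """Bottom-up reduced-error pruning against an independent pruning set."""
    if isinstance(node, Leaf):
        return node
    if isinstance(node, Regression):
        # The subtree only has to explain what this node's line leaves over.
        residuals = [(row, y - line(node, row)) for row, y in data]
        child = prune(node.child, residuals)
        return Regression(node.feature, node.intercept, node.slope, child)
    left = [(r, y) for r, y in data if r[node.feature] <= node.threshold]
    right = [(r, y) for r, y in data if r[node.feature] > node.threshold]
    kept = Split(node.feature, node.threshold,
                 prune(node.left, left), prune(node.right, right),
                 node.fallback)
    # Replace the split by its fallback leaf when that is no worse.
    return node.fallback if sse(node.fallback, data) <= sse(kept, data) else kept
```

In this sketch, calling `prune(tree, pruning_set)` returns a new tree in which a splitting node survives only if it lowers the squared error on the pruning set relative to its fallback leaf; regression nodes are always kept, which is a design choice of the sketch rather than something stated in the abstract.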
Similar resources
Comparing Simplification Methods for Model Trees with Regression and Splitting Nodes
In this paper we tackle the problem of simplifying tree-based regression models, called model trees, which are characterized by two types of internal nodes, namely regression nodes and splitting nodes. We propose two methods which are based on two distinct simplification operators, namely pruning and grafting. Theoretical properties of the methods are reported and the effect of the simplificati...
Simplification Methods for Model Trees with Regression and Splitting Nodes
Model trees are tree-based regression models that associate leaves with linear regression models. A new method for the stepwise induction of model trees (SMOTI) has been developed. Its main characteristic is the construction of trees with two types of nodes: regression nodes, which perform only straight-line regression, and splitting nodes, which partition the feature space. In this way, intern...
Stepwise Induction of Logistic Model Trees
In statistics, logistic regression is a regression model to predict a binomially distributed response variable. Recent research has investigated the opportunity of combining logistic regression with decision tree learners. Following this idea, we propose a novel Logistic Model Tree induction system, SILoRT, which induces trees with two types of nodes: regression nodes, which perform only univar...
Stepwise Induction of Model Trees
Regression trees are tree-based models used to solve those prediction problems in which the response variable is numeric. They differ from the better-known classification or decision trees only in that they have a numeric value rather than a class label associated with the leaves. Model trees are an extension of regression trees in the sense that they associate leaves with multivariate linear m...
Mining Tolerance Regions with Model Trees
Many problems encountered in practice involve the prediction of a continuous attribute associated with an example. This problem, known as regression, requires that samples of past experience with known continuous answers are examined and generalized in a regression model to be used in predicting future examples. Regression algorithms deeply investigated in statistics, machine learning and data ...